By Shelley Wunder-Smith
Four students from Georgia Tech’s Master of Science in Analytics program won first place in Purdue University’s 2022 Data 4 Good Case Competition, surpassing more than 150 other teams.
Sri Ravi Teja Kolipakula (MSA 23), Sanchita Porwal (MSA 23), Harsha Vaddi (MSA 23), and Varun Vankineni (MSA 23) built a data-driven solution for generating image captions in multiple languages.
“We were excited about using data to solve the problem of businesses’ online visibility in other countries where English is not the native language,” Vaddi explained. “Prior research has shown that businesses advertising in these local languages often have poor online responses in the international market. This makes it difficult for them to perform well, economically speaking.”
Competing teams were asked to use image-captioning models to generate advertisement captions in three target languages: Hausa (spoken in West and Central Africa), Thai (the primary language of Thailand), and Kyrgyz (the national language of Kyrgyzstan).
The MSA team built its solution on Meta’s NLLB (No Language Left Behind) translation model and Multilingual CLIP, a multilingual extension of OpenAI’s CLIP model. Both were state-of-the-art when the competition took place. The team also used several other pretrained models hosted on Hugging Face, an open-source platform where developers collaborate on machine learning and AI.
“The judges seemed impressed with our approach and our use of pretrained models, instead of building models from scratch,” Vaddi noted. “The case competition helped us build greater confidence in our skills as data scientists.”
Visit the Data 4 Good Case Competition site for information on the 2024 challenge.